Whole-genome phylogeny of mammals: evolutionary information in genic and nongenic regions.

Authors

  • Gregory E Sims
  • Se-Ran Jun
  • Guohong Albert Wu
  • Sung-Hou Kim
Abstract

Ten complete mammalian genome sequences were compared by using the "feature frequency profile" (FFP) method of alignment-free comparison. This comparison technique reveals that the whole nongenic portion of mammalian genomes contains evolutionary information similar to that of the genic counterparts, the intron and exon regions. We partitioned the complete genomes of mammals (such as human, chimp, horse, and mouse) into their constituent nongenic, intronic, and exonic components. Phylogenetic species trees were constructed for each individual component class of genome sequence data, as well as for the whole genomes, by using standard tree-building algorithms with FFP distances. The phylogenies of the whole genomes and of each component class (exonic, intronic, and nongenic regions) have similar topologies within the optimal feature length range, and all agree well with the evolutionary phylogeny based on a recent large-dataset, multispecies, multigene alignment. In the strictest sense, the FFP-based trees are genome phylogenies, not species phylogenies. However, the species phylogeny is highly related to the whole-genome phylogeny. Furthermore, our results reveal that the footprints of evolutionary history are spread throughout the entire length of the whole genome of an organism and are not limited to genes, introns, or short, highly conserved, nongenic sequences that can be adversely affected by factors (such as the choice of sequences, homoplasy, and different mutation rates) resulting in inconsistent species phylogenies.
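
To make the idea of an FFP distance concrete, the following is a minimal Python sketch, not the authors' implementation. It counts overlapping l-mers in each sequence, normalizes the counts into a frequency profile, and compares profiles pairwise. The abstract does not say which distance is used, so the Jensen-Shannon divergence here is an assumption, and the function names (`ffp`, `js_divergence`, `distance_matrix`) are hypothetical. Refinements such as reverse-complement handling, ambiguous bases, and selection of the optimal feature length are left out.

```python
from collections import Counter
from math import log2


def ffp(sequence, l):
    """Return the relative frequency of every overlapping l-mer in a sequence."""
    seq = sequence.upper()
    if len(seq) < l:
        raise ValueError("sequence is shorter than the feature length")
    counts = Counter(seq[i:i + l] for i in range(len(seq) - l + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse profiles.

    Assumed distance for illustration: 0 for identical profiles,
    1 for profiles that share no features.
    """
    div = 0.0
    for feature in set(p) | set(q):
        pi = p.get(feature, 0.0)
        qi = q.get(feature, 0.0)
        mi = 0.5 * (pi + qi)  # mixture of the two profiles
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def distance_matrix(sequences, l):
    """Pairwise distances for a dict mapping name -> sequence."""
    profiles = {name: ffp(seq, l) for name, seq in sequences.items()}
    names = sorted(profiles)
    matrix = [[js_divergence(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix


if __name__ == "__main__":
    # Toy sequences, purely illustrative; real input would be whole genomes
    # or their exonic, intronic, and nongenic partitions.
    toy = {"a": "ACGTACGTTGCA", "b": "ACGTACGATGCA", "c": "TTTTGGGGCCCC"}
    names, matrix = distance_matrix(toy, l=3)
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

The resulting matrix is the kind of input that a standard distance-based tree-building algorithm, such as neighbor joining, takes to produce a tree.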

Related resources

Alignment-free genome comparison with feature frequency profiles (FFP) and optimal resolutions.

For comparison of whole-genome (genic + nongenic) sequences, multiple sequence alignment of a few selected genes is not appropriate. One approach is to use an alignment-free method in which feature (or l-mer) frequency profiles (FFP) of whole genomes are used for comparison-a variation of a text or book comparison method, using word frequency profiles. In this approach it is critical to identif...

Evolutionary constraints in conserved nongenic sequences of mammals.

Mammalian genomes contain many highly conserved nongenic sequences (CNGs) whose functional significance is poorly understood. Sets of CNGs have previously been identified by selecting the most conserved elements from a chromosome or genome, but in these highly selected samples, conservation may be unrelated to purifying selection. Furthermore, conservation of CNGs may be caused by mutation rate...

Genic and nongenic contributions to natural variation of quantitative traits in maize.

The complex genomes of many economically important crops present tremendous challenges to understand the genetic control of many quantitative traits with great importance in crop production, adaptation, and evolution. Advances in genomic technology need to be integrated with strategic genetic design and novel perspectives to break new ground. Complementary to individual-gene-targeted research, ...

Toll Evolution: a Perspective from Regulatory Regions

Background: Toll and Toll-related proteins play an important role in antibacterial innate immunity and are widespread in insects, plants, and mammals. The completion of new genomes such as Anopheles gambiae has provided an avenue for a deeper understanding of Toll evolution. While most evolutionary analyses are performed on protein sequences, here, we present a unique phylogenetic analysis of T...

Normalized Information Distance and Whole Mitochondrial Genome Phylogeny Analysis

A new class of similarity measures aimed at measuring the evolutionary relation of sequences is studied. A prime example is the “normalized information distance”, based on the noncomputable notion of Kolmogorov complexity. We demonstrate that it is a metric, takes values in [0, 1], and is universal. To apply it (and some related metrics) we use a simple approximation scheme to computationally c...
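
Because Kolmogorov complexity is noncomputable, a common practical stand-in replaces it with the output size of a real compressor, which gives the normalized compression distance (NCD). The sketch below is only an illustration under that assumption: the truncated summary above does not say which approximation scheme the authors use, and `zlib` as the compressor and the helper names `compressed_size`, `ncd`, and `random_dna` are choices made here.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, standing in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(n: int, seed: int) -> bytes:
    """Hypothetical helper: a reproducible pseudo-random DNA string for demos."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"ACGT") for _ in range(n))


if __name__ == "__main__":
    x = random_dna(2000, seed=1)
    y = random_dna(2000, seed=2)
    print("same sequence:     ", round(ncd(x, x), 3))
    print("unrelated sequence:", round(ncd(x, y), 3))
```

With a real compressor the value can fall slightly outside [0, 1], and it is only as good as the compressor's ability to detect shared structure, so a general-purpose compressor with a small window suits short sequences such as mitochondrial genomes better than whole nuclear genomes.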

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 106, Issue 40

Pages: -

Publication date: 2009